The semantically annotated corpus of Polish quantificational expressions
Abstract
The paper presents a manually annotated corpus of Polish quantificational expressions. The quantifier annotation was conducted on top of existing gold-standard data for Polish as its separate layer. This paper releases the corpus and gives an overview of the related tools. As far as we know, this is the first large-scale corpus of generalized quantifiers together with their crucial semantic properties, including monotonicity profile. We also discuss potential further use of the corpus in linguistics and cognitive science.
Similar resources
A Semantically Annotated Swedish Medical Corpus
With the information overload in the life sciences there is an increasing need for annotated corpora, particularly with biological and biomedical entities, which is the driving force for data-driven language processing applications and the empirical approach to language study. Inspired by the work in the GENIA Corpus, which is one of the very few of such corpora, extensively used in the biomedi...
Developing a large semantically annotated corpus
What would be a good method to provide a large collection of semantically annotated texts with formal, deep semantics rather than shallow? We argue that a bootstrapping approach comprising state-of-the-art NLP tools for parsing and semantic interpretation, in combination with a wiki-like interface for collaborative annotation of experts, and a game with a purpose for crowdsourcing, are the star...
YAWN: A Semantically Annotated Wikipedia XML Corpus
The paper presents YAWN, a system to convert the well-known and widely used Wikipedia collection into an XML corpus with semantically rich, self-explaining tags. We introduce algorithms to annotate pages and links with concepts from the WordNet thesaurus. This annotation process exploits categorical information in Wikipedia, which is a high-quality, manually assigned source of information, extr...
A Semantically Annotated Corpus from MEDLINE Abstracts
Automatic information extraction is a key technology to help researchers access the information contained in research papers and to extend databases on substances and biological processes. We aim to build information extraction databases [2] from biochemical papers and their abstracts available from the MEDLINE [3] database. To objectively measure the performance of our systems, we built a corp...
GENIA corpus - a semantically annotated corpus for bio-textmining
MOTIVATION Natural language processing (NLP) methods are regarded as being useful to raise the potential of text mining from biological literature. The lack of an extensively annotated corpus of this literature, however, causes a major bottleneck for applying NLP techniques. GENIA corpus is being developed to provide reference materials to let NLP techniques work for bio-textmining. RESULTS G...
Journal
Journal title: Language Resources and Evaluation
Year: 2022
ISSN: 1574-020X, 1574-0218
DOI: https://doi.org/10.1007/s10579-022-09578-4